Overview

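I compared coverage profiles and estimated abundances for CAR and related transcripts using three sets of samples: bulk libraries from project P89, single-cell libraries from project P89, and single-cell libraries from project P85.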

 

Getting the data

The data presented below were generated and/or compiled from several sources:

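library(rtracklayer)  # provides import.gff2()

# Load the precompiled per-sample objects (library metrics, RapMap, Salmon, TCR)
load("data/sample_metrics_data.RData")
load("data/sample_rapmap_data.RData")
load("data/sample_salmon_data.RData")
load("data/sample_tcr_data.RData")

# Read the annotation for CAR and related transcripts (GTF is GFF version 2)
gff_file <- "data/annotation/carPlus.gtf"
xcripts_gtf <- import.gff2(gff_file)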

 

Results

Visualizing CAR coverage & abundance

The plots below show read coverage from RapMap mapping across the length of the CAR sequence. Segments in the transcript, corresponding to the gene parts used to build the construct, are depicted by colored boxes in each plot. Transparency (i.e., alpha) is scaled based on the estimated abundance of the CAR transcript (TPM) as measured by Salmon.
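
The layout described above can be sketched with ggplot2 roughly as follows. This is an illustration only, not the code used for this report: the data frames `cov_df` and `seg_df`, their column names, the segment coordinates, and all values are hypothetical toy data.

```r
library(ggplot2)

# Hypothetical per-position coverage for two libraries (toy values);
# tpm is the Salmon estimate for the CAR transcript in each library.
cov_df <- data.frame(
  lib_id = rep(c("lib_A", "lib_B"), each = 6),
  pos    = rep(seq(1, 501, by = 100), times = 2),
  reads  = c(0, 4, 9, 7, 3, 1, 0, 0, 1, 2, 0, 0),
  tpm    = rep(c(35.2, 1.4), each = 6)
)

# Hypothetical segment boundaries; real ones would come from the GTF.
seg_df <- data.frame(
  segment = c("CD19scFv", "T2A", "EGFRt"),
  start   = c(1, 201, 301),
  end     = c(200, 300, 501)
)

p <- ggplot() +
  # Colored boxes marking the construct segments, drawn below the x-axis
  geom_rect(data = seg_df,
            aes(xmin = start, xmax = end, ymin = -1, ymax = 0,
                fill = segment)) +
  # One coverage line per library; transparency follows CAR abundance
  geom_line(data = cov_df,
            aes(x = pos, y = reads, group = lib_id,
                alpha = log2(tpm + 1))) +
  scale_alpha_continuous(range = c(0.2, 1)) +
  labs(x = "Position in CAR transcript", y = "Reads",
       alpha = "log2(TPM + 1)")
```

Calling `print(p)` draws the figure. Mapping alpha to log2(TPM + 1) rather than raw TPM is a choice made here to keep low-abundance libraries visible.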

 

Bulk libraries from project P89

 

Single-cell libraries from project P89

 

Single-cell libraries from project P85

 

Inspecting abundance of non-CAR transcripts

The following human transcripts (which overlap the CAR sequence) were quantified by Salmon.

xcript_name  segment_version
CSF2         GMCSFRss_r1
CD28         CD28tm_r1
CD28         CD28tm_r2
CD28         CD28tm_r3
TNFRSF9      IgG4hinge_r1
CD247        CD3Zeta_r1
CD247        CD3Zeta_r2
EGFR         EGFRt_r1
EGFR         EGFRt_r2
EGFR         EGFRt_r3
EGFR         EGFRt_r4
EGFR         EGFRt_r5

 

Each plot shows CAR coverage across libraries, as in the plots above, but with coverage colored by the estimated abundance of the respective human transcript rather than of CAR itself.

 

Attempting to define CAR detection rules

I tried to come up with a relatively simple way to classify whether CAR was detected in a particular library.

 

Binarized CAR expression

expressed: log2(TPM + 1) \(\gt\) 0 for the CAR transcript

car_expr_tpm  nz_cov  n_libs
car_expr_tpm  FALSE       18
car_expr_tpm  TRUE        55
car_expr_tpm  NA           1
no_car_tpm    FALSE      444
no_car_tpm    TRUE        37
no_car_tpm    NA           4
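
A minimal sketch of how a table like this could be produced is shown below. It is not the report's code: the data frame `libs`, its column names, and its values are hypothetical, and reading `nz_cov` as "at least one CAR position with non-zero coverage" is an assumption.

```r
# Toy per-library table (hypothetical names and values)
libs <- data.frame(
  lib_id    = c("lib1", "lib2", "lib3", "lib4"),
  car_tpm   = c(0, 12.5, 0.8, 0),   # Salmon TPM for the CAR transcript
  n_cov_pos = c(0, 340, 15, 2)      # CAR positions with > 0 reads
)

# Binarized expression: log2(TPM + 1) > 0, i.e. any non-zero TPM
libs$car_expr_tpm <- log2(libs$car_tpm + 1) > 0

# Assumed meaning of nz_cov: any position with non-zero coverage
libs$nz_cov <- libs$n_cov_pos > 0

# Cross-tabulate the call against coverage, counting libraries per cell
tab <- table(car_expr_tpm = libs$car_expr_tpm, nz_cov = libs$nz_cov)
```

Here `lib4` has non-zero coverage but zero TPM, which is the kind of disagreement the `no_car_tpm` / `TRUE` row captures.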

 

Quantification-based rule

expressed: log2(TPM + 1) \(\geq\) 2.5 in ANY CAR or EGFRt transcript, OR log2(TPM + 1) \(\gt\) 2 in ALL CAR or EGFRt transcripts

car_expr_quant  nz_cov  n_libs
car_expr_quant  FALSE      107
car_expr_quant  TRUE        58
car_expr_quant  NA           2
no_car_quant    FALSE      355
no_car_quant    TRUE        34
no_car_quant    NA           3
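
The quantification-based rule can be sketched as below, under one reading of it: the first clause is satisfied if any CAR or EGFRt transcript reaches the higher threshold, the second only if all of them exceed the lower one. The matrix `tpm`, its row and column names, and its values are hypothetical; this is not the report's code.

```r
# Toy TPM matrix: rows are libraries, columns are CAR/EGFRt transcripts
tpm <- rbind(
  lib1 = c(CAR = 0.0, EGFRt_r1 = 0.0, EGFRt_r2 = 0.0),
  lib2 = c(CAR = 6.0, EGFRt_r1 = 0.5, EGFRt_r2 = 0.0),
  lib3 = c(CAR = 3.5, EGFRt_r1 = 3.2, EGFRt_r2 = 3.1),
  lib4 = c(CAR = 3.5, EGFRt_r1 = 1.0, EGFRt_r2 = 4.0)
)
log_tpm <- log2(tpm + 1)

# Clause 1: at least one transcript with log2(TPM + 1) >= 2.5
any_high <- apply(log_tpm >= 2.5, 1, any)

# Clause 2: every transcript with log2(TPM + 1) > 2
all_moderate <- apply(log_tpm > 2, 1, all)

car_expr_quant <- any_high | all_moderate
```

In this toy data `lib2` passes on the first clause alone, `lib3` passes only on the second, and `lib4` fails both because one transcript falls below the lower threshold.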

 

Coverage-based rule

expressed: \(\geq\) 10 positions with \(\gt\) 0 reads in ANY of CD19scFv, T2A, or EGFRt

car_expr_cov  nz_cov  n_libs
car_expr_cov  FALSE        2
car_expr_cov  TRUE        53
no_car_cov    FALSE      460
no_car_cov    TRUE        39
NA            NA           5
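
As a rough sketch of the coverage-based rule (not the report's code), the helper below takes one library's per-position read counts split by segment. The function name `call_car_by_coverage`, the `min_pos` argument, and the toy coverage vectors are all hypothetical.

```r
# Returns TRUE if any segment has at least min_pos positions with > 0 reads
call_car_by_coverage <- function(cov_by_segment, min_pos = 10) {
  n_covered <- vapply(cov_by_segment, function(x) sum(x > 0), integer(1))
  any(n_covered >= min_pos)
}

# Toy library with 12 covered positions in CD19scFv
lib_pos <- list(
  CD19scFv = c(rep(0, 90), rep(3, 12)),
  T2A      = rep(0, 20),
  EGFRt    = rep(0, 50)
)

# Toy library with 9 covered positions in total, but fewer than 10 per segment
lib_neg <- list(
  CD19scFv = c(rep(0, 95), rep(1, 5)),
  T2A      = c(rep(0, 16), rep(2, 4)),
  EGFRt    = rep(0, 50)
)

call_pos <- call_car_by_coverage(lib_pos)
call_neg <- call_car_by_coverage(lib_neg)
```

The second library shows that the threshold is applied per segment: covered positions are not pooled across CD19scFv, T2A, and EGFRt.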

 

Session info

## R version 3.2.1 (2015-06-18)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.11.4 (unknown)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] cowplot_0.6.1        scales_0.4.0         ggthemes_3.0.2      
##  [4] rtracklayer_1.30.4   GenomicRanges_1.22.4 GenomeInfoDb_1.6.3  
##  [7] IRanges_2.4.8        S4Vectors_0.8.11     BiocGenerics_0.16.1 
## [10] viridis_0.3.4        ggplot2_2.1.0        dplyr_0.4.3         
## [13] tidyr_0.4.1          stringr_1.0.0        knitr_1.12.3        
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.4                highr_0.5.1               
##  [3] futile.logger_1.4.1        formatR_1.3               
##  [5] plyr_1.8.3                 XVector_0.10.0            
##  [7] futile.options_1.0.0       bitops_1.0-6              
##  [9] tools_3.2.1                zlibbioc_1.16.0           
## [11] digest_0.6.9               lattice_0.20-33           
## [13] nlme_3.1-126               evaluate_0.8.3            
## [15] gtable_0.2.0               mgcv_1.8-12               
## [17] Matrix_1.2-4               DBI_0.3.1                 
## [19] yaml_2.1.13                gridExtra_2.2.1           
## [21] Biostrings_2.38.4          grid_3.2.1                
## [23] Biobase_2.30.0             R6_2.1.2                  
## [25] XML_3.98-1.4               BiocParallel_1.4.3        
## [27] rmarkdown_0.9.5            reshape2_1.4.1            
## [29] lambda.r_1.1.7             magrittr_1.5              
## [31] GenomicAlignments_1.6.3    Rsamtools_1.22.0          
## [33] htmltools_0.3.5            SummarizedExperiment_1.0.2
## [35] assertthat_0.1             colorspace_1.2-6          
## [37] labeling_0.3               stringi_1.0-1             
## [39] lazyeval_0.1.10            RCurl_1.95-4.8            
## [41] munsell_0.4.3